AITopics

Genre: Research Report > Experimental Study (1.00)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (0.93)
Automobiles & Trucks (0.93)

Technology:

Information Technology > Sensing and Signal Processing (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.92)

Neural Information Processing SystemsJun-12-2026, 16:56:42 GMT

TGA: True-to-Geometry Avatar Dynamic Reconstruction

Recent advances in 3D Gaussian Splatting (3DGS) have improved the visual fidelity of dynamic avatar reconstruction. However, existing methods often overlook the inherent chromatic similarity of human skin tones, leading to poor capture of intricate facial geometry under subtle appearance changes. This is caused by the affine approximation of Gaussian projection, which fails to be perspective-aware to depth-induced shear effects. To this end, we propose True-to-Geometry Avatar Dynamic Reconstruction (TGA), a perspective-aware 4D Gaussian avatar framework that sensitively captures fine-grained facial variations for accurate 3D geometry reconstruction. Specifically, to enable color-sensitive and geometry-consistent Gaussian representations under dynamic conditions, we introduce Perspective-Aware Gaussian Transformation that jointly models temporal deformations and spatial projection by integrating Jacobian-guided adaptive deformation into the homogeneous formulation. Furthermore, we develop Incremental BVH Tree Pivoting to enable fast frame-by-frame mesh extraction for 4D Gaussian representations. A dynamic Gaussian Bounding Volume Hierarchy (BVH) tree is used to model the topological relationships among points, where active ones are filtered out by BVH pivoting and subsequently re-triangulated for surface reconstruction. Extensive experiments demonstrate that TGA achieves superior geometric accuracy.

artificial intelligence, name change, proceedings, (4 more...)

Technology: Information Technology > Artificial Intelligence (0.40)

Neural Information Processing SystemsFeb-16-2026, 02:31:01 GMT

GSGAN: Adversarial Learning for Hierarchical Generation of 3D Gaussian Splats

Figure 1: Generated examples from the proposed method (FFHQ-512, AFHQ-Cat-512).

artificial intelligence, machine learning, natural language, (17 more...)

Country: Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)

Neural Information Processing SystemsFeb-14-2026, 05:48:42 GMT

GaussianGraphNetwork: LearningEfficientand GeneralizableGaussianRepresentationsfrom Multi-viewImages

Weconduct experiments on the large-scale RealEstate10K and ACID datasets to demonstrate the efficiency and generalization of our method.

artificial intelligence, arxivpreprintarxiv, justification, (15 more...)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Vision (0.30)

Neural Information Processing SystemsFeb-7-2026, 22:44:01 GMT

BatchNormalizationOrthogonalizesRepresentations inDeepRandomNetworks

More precisely, under a mild assumption, we prove that the deviation of the representations from orthogonality rapidly decays with depth up to a term inversely proportional to the network width. This result has two main implications: 1) Theoretically, as the depth grows, the distribution of the representation -after the linear layers-contracts to a Wasserstein-2 ball around an isotropic Gaussian distribution.

artificial intelligence, machine learning, representation, (17 more...)

Country: Europe > Latvia > Lubāna Municipality > Lubāna (0.05)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.71)

arXiv.org Artificial IntelligenceDec-11-2025

D$^2$GSLAM: 4D Dynamic Gaussian Splatting SLAM

Zhu, Siting, Huang, Yuxiang, Wu, Wenhua, Jiang, Chaokang, Chen, Yongbo, Chen, I-Ming, Wang, Hesheng

Recent advances in Dense Simultaneous Localization and Mapping (SLAM) have demonstrated remarkable performance in static environments. However, dense SLAM in dynamic environments remains challenging. Most methods directly remove dynamic objects and focus solely on static scene reconstruction, which ignores the motion information contained in these dynamic objects. In this paper, we present D$^2$GSLAM, a novel dynamic SLAM system utilizing Gaussian representation, which simultaneously performs accurate dynamic reconstruction and robust tracking within dynamic environments. Our system is composed of four key components: (i) We propose a geometric-prompt dynamic separation method to distinguish between static and dynamic elements of the scene. This approach leverages the geometric consistency of Gaussian representation and scene geometry to obtain coarse dynamic regions. The regions then serve as prompts to guide the refinement of the coarse mask for achieving accurate motion mask. (ii) To facilitate accurate and efficient mapping of the dynamic scene, we introduce dynamic-static composite representation that integrates static 3D Gaussians with dynamic 4D Gaussians. This representation allows for modeling the transitions between static and dynamic states of objects in the scene for composite mapping and optimization. (iii) We employ a progressive pose refinement strategy that leverages both the multi-view consistency of static scene geometry and motion information from dynamic objects to achieve accurate camera tracking. (iv) We introduce a motion consistency loss, which leverages the temporal continuity in object motions for accurate dynamic modeling. Our D$^2$GSLAM demonstrates superior performance on dynamic scenes in terms of mapping and tracking accuracy, while also showing capability in accurate dynamic modeling.

artificial intelligence, machine learning, representation, (14 more...)

2512.09411

Country: Asia > China (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)

arXiv.org Artificial IntelligenceDec-3-2025

Beyond Pixels: Efficient Dataset Distillation via Sparse Gaussian Representation

Jiang, Chenyang, Li, Zhengcen, Zhao, Hang, Shan, Qiben, Wu, Shaocong, Su, Jingyong

Dataset distillation has emerged as a promising paradigm that synthesizes compact, informative datasets capable of retaining the knowledge of large-scale counterparts, thereby addressing the substantial computational and storage burdens of modern model training. Conventional approaches typically rely on dense pixel-level representations, which introduce redundancy and are difficult to scale up. In this work, we propose GSDD, a novel and efficient sparse representation for dataset distillation based on 2D Gaussians. Instead of representing all pixels equally, GSDD encodes critical discriminative information in a distilled image using only a small number of Gaussian primitives. This sparse representation could improve dataset diversity under the same storage budget, enhancing coverage of difficult samples and boosting distillation performance. To ensure both efficiency and scalability, we adapt CUDA-based splatting operators for parallel inference and training, enabling high-quality rendering with minimal computational and memory overhead. Our method is simple yet effective, broadly applicable to different distillation pipelines, and highly scalable. Experiments show that GSDD achieves state-of-the-art performance on CIFAR-10, CIFAR-100, and ImageNet subsets, while remaining highly efficient encoding and decoding cost. Our code is available at https://github.com/j-cyoung/GSDatasetDistillation.

artificial intelligence, dataset distillation, machine learning, (14 more...)

2509.26219

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)

arXiv.org Artificial IntelligenceNov-25-2025

Rad-GS: Radar-Vision Integration for 3D Gaussian Splatting SLAM in Outdoor Environments

Xiao, Renxiang, Liu, Wei, Zhang, Yuanfan, Chen, Yushuai, Chen, Jinming, Wang, Zilu, Hu, Liang

We present Rad-GS, a 4D radar-camera SLAM system designed for kilometer-scale outdoor environments, utilizing 3D Gaussian as a differentiable spatial representation. Rad-GS combines the advantages of raw radar point cloud with Doppler information and geometrically enhanced point cloud to guide dynamic object masking in synchronized images, thereby alleviating rendering artifacts and improving localization accuracy. Additionally, unsynchronized image frames are leveraged to globally refine the 3D Gaussian representation, enhancing texture consistency and novel view synthesis fidelity. Furthermore, the global octree structure coupled with a targeted Gaussian primitive management strategy further suppresses noise and significantly reduces memory consumption in large-scale environments. Extensive experiments and ablation studies demonstrate that Rad-GS achieves performance comparable to traditional 3D Gaussian methods based on camera or LiDAR inputs, highlighting the feasibility of robust outdoor mapping using 4D mmWave radar. Real-world reconstruction at kilometer scale validates the potential of Rad-GS for large-scale scene reconstruction.

artificial intelligence, point cloud, radar point cloud, (14 more...)

doi: 10.1109/LRA.2025.3630875

2511.16091

Country: Asia > China (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

arXiv.org Artificial IntelligenceNov-4-2025

GauDP: Reinventing Multi-Agent Collaboration through Gaussian-Image Synergy in Diffusion Policies

Wang, Ziye, Kang, Li, Qin, Yiran, Ma, Jiahua, Peng, Zhanglin, Bai, Lei, Zhang, Ruimao

Recently, effective coordination in embodied multi-agent systems has remained a fundamental challenge, particularly in scenarios where agents must balance individual perspectives with global environmental awareness. Existing approaches often struggle to balance fine-grained local control with comprehensive scene understanding, resulting in limited scalability and compromised collaboration quality. In this paper, we present GauDP, a novel Gaussian-image synergistic representation that facilitates scalable, perception-aware imitation learning in multi-agent collaborative systems. Specifically, GauDP constructs a globally consistent 3D Gaussian field from decentralized RGB observations, then dynamically redistributes 3D Gaussian attributes to each agent's local perspective. This enables all agents to adaptively query task-critical features from the shared scene representation while maintaining their individual viewpoints. This design facilitates both fine-grained control and globally coherent behavior without requiring additional sensing modalities (e.g., 3D point cloud). We evaluate GauDP on the RoboFactory benchmark, which includes diverse multi-arm manipulation tasks. Our method achieves superior performance over existing image-based methods and approaches the effectiveness of point-cloud-driven methods, while maintaining strong scalability as the number of agents increases.

artificial intelligence, arxiv preprint arxiv, representation, (16 more...)

2511.00998

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.66)

arXiv.org Artificial IntelligenceOct-29-2025

GaussianFusion: Gaussian-Based Multi-Sensor Fusion for End-to-End Autonomous Driving

Liu, Shuai, Liang, Quanmin, Li, Zefeng, Li, Boyang, Huang, Kai

Multi-sensor fusion is crucial for improving the performance and robustness of end-to-end autonomous driving systems. Existing methods predominantly adopt either attention-based flatten fusion or bird's eye view fusion through geometric transformations. However, these approaches often suffer from limited interpretability or dense computational overhead. In this paper, we introduce GaussianFusion, a Gaussian-based multi-sensor fusion framework for end-to-end autonomous driving. Our method employs intuitive and compact Gaussian representations as intermediate carriers to aggregate information from diverse sensors. Specifically, we initialize a set of 2D Gaussians uniformly across the driving scene, where each Gaussian is parameterized by physical attributes and equipped with explicit and implicit features. These Gaussians are progressively refined by integrating multi-modal features. The explicit features capture rich semantic and spatial information about the traffic scene, while the implicit features provide complementary cues beneficial for trajectory planning. To fully exploit rich spatial and semantic information in Gaussians, we design a cascade planning head that iteratively refines trajectory predictions through interactions with Gaussians. Extensive experiments on the NAVSIM and Bench2Drive benchmarks demonstrate the effectiveness and robustness of the proposed GaussianFusion framework. The source code will be released at https://github.com/Say2L/GaussianFusion.

artificial intelligence, information fusion, machine learning, (16 more...)

2506.00034

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Robotics & Automation (0.94)
Automobiles & Trucks (0.94)

Technology:

Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.93)